Posterior predictive checks to quantify lack-of-fit in admixture models of latent population structure.

Authors

  • David Mimno
  • David M Blei
  • Barbara E Engelhardt
Abstract

Admixture models are a ubiquitous approach to capture latent population structure in genetic samples. Despite the widespread application of admixture models, little thought has been devoted to the quality of the model fit or the accuracy of the estimates of parameters of interest for a particular study. Here we develop methods for validating admixture models based on posterior predictive checks (PPCs), a Bayesian method for assessing the quality of fit of a statistical model to a specific dataset. We develop PPCs for five population-level statistics of interest: within-population genetic variation, background linkage disequilibrium, number of ancestral populations, between-population genetic variation, and the downstream use of admixture parameters to correct for population structure in association studies. Using PPCs, we evaluate the quality of the admixture model fit to four qualitatively different population genetic datasets: the population reference sample (POPRES) European individuals, the HapMap phase 3 individuals, continental Indians, and African American individuals. We found that the same model fitted to different genomic studies resulted in highly study-specific results when evaluated using PPCs, illustrating the utility of PPCs for model-based analyses in large genomic studies.
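The general idea of a posterior predictive check can be sketched on a toy problem. The example below is not the authors' admixture model or any of their five statistics. It assumes a single locus, a single population with one allele frequency under a uniform Beta(1, 1) prior, and genotype variance as a stand-in discrepancy function. The names `discrepancy` and `posterior_predictive_pvalue` are hypothetical and chosen only for illustration.

```python
import random


def discrepancy(genotypes):
    """Population variance of genotype counts (0, 1, or 2 allele copies).

    This is an illustrative discrepancy function, not one from the paper.
    """
    n = len(genotypes)
    mean = sum(genotypes) / n
    return sum((g - mean) ** 2 for g in genotypes) / n


def posterior_predictive_pvalue(observed, n_reps=2000, seed=0):
    """Estimate how often replicated data look at least as extreme as the
    observed data under a one-population, one-locus model (an assumption)."""
    rng = random.Random(seed)
    n = len(observed)
    alt = sum(observed)        # copies of the counted allele
    ref = 2 * n - alt          # copies of the other allele
    d_obs = discrepancy(observed)

    exceed = 0
    for _ in range(n_reps):
        # Draw an allele frequency from the Beta posterior.
        p = rng.betavariate(1 + alt, 1 + ref)
        # Simulate a replicated dataset: two independent allele draws each.
        replicated = [(rng.random() < p) + (rng.random() < p) for _ in range(n)]
        if discrepancy(replicated) >= d_obs:
            exceed += 1
    return exceed / n_reps


if __name__ == "__main__":
    # Two hidden subpopulations fixed for different alleles: the
    # one-population model cannot reproduce this much genotype variance.
    structured = [0] * 50 + [2] * 50
    print(posterior_predictive_pvalue(structured))
```

A posterior predictive p-value close to 0 or 1 suggests that the model fails to reproduce the chosen statistic for this dataset. Here the sample hides two subpopulations, so the single-population model underestimates the genotype variance and the check flags the misfit.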


Similar resources

Measuring Confidence in Temporal Topic Models with Posterior Predictive Checks

Large text collections are useful in social science research, but building reliable predictive models is difficult. Researchers must either deal directly with sparse, noisy, high-dimensional language data or use latent variable models to infer more tractable lower-dimensional patterns. For conclusions based on latent variable models to be reliable, however, it is necessary to measure the degree...


Fitting Position Latent Cluster Models for Social Networks with latentnet.

latentnet is a package to fit and evaluate statistical latent position and cluster models for networks. Hoff, Raftery, and Handcock (2002) suggested an approach to modeling networks based on positing the existence of a latent space of characteristics of the actors. Relationships form as a function of distances between these characteristics as well as functions of observed dyadic level covariat...


Posterior contraction of the population polytope in finite admixture models

We study the posterior contraction behavior of the latent population structure that arises in admixture models as the amount of data increases. An admixture model (alternatively known as a topic model) specifies k populations, each of which is characterized by a ∆-valued vector of frequencies for generating a set of discrete values in {0, 1, ..., d}. The population polytope is defined as t...


A Bayesian approach to the selection and testing of latent class models

An important part of a latent class analysis concerns the selection of the number of latent classes. In this paper, we discuss the Bayes factor as a selection tool. The discussion will focus on two aspects: (i) the computation of the Bayes factor and (ii) prior sensitivity. To deal with prior sensitivity, we propose to extend the model with a prior for the hyperparameters. We further discuss th...


Application of the Bayesian latent class model to determine the diagnostic value of brain SPECT and MRI for assessing the sense of smell after trauma in the absence of a gold standard

Abstract. Introduction: The sense of smell adds an indescribable quality to human life, and its impairment creates many problems. MRI and SPECT are two methods of olfactory evaluation, neither of which is a gold standard. The Bayesian latent class model is the correct way to determine the diagnostic value of these tests. Methods: MRI and SPECT tests performed on 63 patients e...



Journal title:
  • Proceedings of the National Academy of Sciences of the United States of America

Volume 112, Issue 26

Pages -

Publication date: 2015